APPLICATION NOTE Visualization for genomics: the Microbial Genome Viewer

Authors

  • Robert Kerkhoven
  • Frank H. J. van Enckevort
  • Jos Boekhorst
  • Douwe Molenaar
  • Roland J. Siezen
Abstract

Summary: A web-based visualization tool, the Microbial Genome Viewer, is presented which allows the user to combine complex genomic data in a highly interactive way. This web tool enables the interactive generation of chromosome wheels and linear genome maps from genome annotation data stored in a MySQL database. The generated images are in Scalable Vector Graphics (SVG) format, which is suitable for creating high-quality scalable images and dynamic web representations. Gene-related data such as transcriptome and time-course microarray experiments can be superimposed on the maps for visual inspection.

Availability: The Microbial Genome Viewer 1.0 is freely available at http://www.cmbi.kun.nl/MGV

Contact: [email protected]

INTRODUCTION

The ongoing data explosion in the fields of genomics, transcriptomics and metabolomics has created the need for data-handling tools that can reveal complex relations and patterns. In genomics, much effort has been devoted to structuring and standardizing data. The need for world-wide standards for data exchange is clearly reflected in the numerous projects defining protocols for information storage in XML. Besides the growing need to structure this data, visualizing it is an essential step in the flow of knowledge.

Few free software tools are currently available for interactive visualization of circular or linear genome maps. Programs like GenomePlot (Gibson and Smith, 2003) and GenoMap (Sato and Ehira, 2003) are used for creating chromosome wheels and linear maps of genomic data stored in tab-delimited or GenBank/EMBL format, respectively. Both are stand-alone programs; the latter can also plot microarray expression or other types of quantitative data. Both programs share two drawbacks: the generated pictures are not interactive, and specific input formats are required.
A non-interactive structural DNA analysis and visualization is provided for all sequenced genomes by GenomeAtlas (Pedersen et al., 2000) through a dedicated web interface. Additionally, commercial software such as GenoStar and DNASTAR includes genome visualization and analysis.

Bioinformatics © Oxford University Press 2004; all rights reserved. Bioinformatics Advance Access published February 26, 2004. Downloaded from http://bioinformatics.oxfordjournals.org/ by guest on January 1, 2016.

IMPLEMENTATION

Here we present the Microbial Genome Viewer (MGV), a web-based tool for interactive visualization of annotation and transcriptome data on chromosome wheels and linear genome maps. Genome annotation data is retrieved from a local MySQL database, enabling rapid visualization. The data includes all the relevant information on the complete genomes from the GenBank/EMBL/DDBJ databases. Several supplementary annotation methods have been automated in order to enrich this database. Terminator structures were added with TransTerm (Ermolaeva et al., 2000), and all organisms were scanned for COG (Tatusov et al., 2001) and Pfam domains (Bateman et al., 2002) using a Paracel GeneMatcher2 machine.

Scalable Vector Graphics (SVG) was chosen as the format for visualization. SVG is a language for describing two-dimensional images in XML and has been recommended by the W3C (http://www.w3c.org/SVG). The ability to add animation, interaction and scripting makes SVG highly suitable for the visualization of complex data generated by the genomics community. Other advantages are the small file size and the scalability needed to obtain high-resolution images for posters and publications. SVG plugins for a wide range of platforms are available (http://www.adobe.com/SVG). Genes can be colored according to different annotation methods.
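The SVG output described above can be illustrated with a minimal sketch. The Python below is not MGV code. It assumes a hypothetical list of gene records (name, start, end, functional category) that has already been fetched from an annotation database, and it draws each gene as a colored arc segment on a chromosome wheel. The palette, the record layout and the function names `polar`, `gene_arc` and `render_wheel` are all assumptions made for illustration. A `<title>` child element serves as a simple stand-in for mouse-over behaviour.

```python
import math
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical category palette (an assumption, not the MGV colour scheme).
PALETTE = {"metabolism": "#1f77b4", "information": "#d62728", "unknown": "#999999"}


def polar(cx, cy, r, frac):
    """Point at fraction `frac` of the circle, clockwise from 12 o'clock."""
    angle = 2 * math.pi * frac - math.pi / 2
    return cx + r * math.cos(angle), cy + r * math.sin(angle)


def gene_arc(cx, cy, r_in, r_out, start, end, genome_len):
    """SVG path data for the ring segment covering bases start..end."""
    f0, f1 = start / genome_len, end / genome_len
    large = 1 if (f1 - f0) > 0.5 else 0  # SVG large-arc flag
    x0, y0 = polar(cx, cy, r_out, f0)
    x1, y1 = polar(cx, cy, r_out, f1)
    x2, y2 = polar(cx, cy, r_in, f1)
    x3, y3 = polar(cx, cy, r_in, f0)
    # Outer arc clockwise (sweep flag 1), inner arc back (sweep flag 0).
    return (
        f"M{x0:.2f},{y0:.2f} A{r_out},{r_out} 0 {large} 1 {x1:.2f},{y1:.2f} "
        f"L{x2:.2f},{y2:.2f} A{r_in},{r_in} 0 {large} 0 {x3:.2f},{y3:.2f} Z"
    )


def render_wheel(genes, genome_len, size=400):
    """Return an SVG document with one coloured arc per gene."""
    cx = cy = size / 2
    r_out, r_in = size * 0.45, size * 0.38
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{size}" '
        f'height="{size}" viewBox="0 0 {size} {size}">'
    ]
    for g in genes:
        color = PALETTE.get(g["category"], PALETTE["unknown"])
        d = gene_arc(cx, cy, r_in, r_out, g["start"], g["end"], genome_len)
        # <title> gives a tooltip on mouse-over in SVG-capable browsers.
        parts.append(
            f'<path d="{d}" fill="{color}"><title>{escape(g["name"])}</title></path>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


# Made-up example records; a real tool would query these from its database.
genes = [
    {"name": "geneA", "start": 0, "end": 250_000, "category": "metabolism"},
    {"name": "geneB", "start": 300_000, "end": 550_000, "category": "information"},
    {"name": "geneC", "start": 600_000, "end": 950_000, "category": "other"},
]
svg = render_wheel(genes, genome_len=1_000_000)
root = ET.fromstring(svg)  # parsing confirms the output is well-formed XML
```

Writing `svg` to a file and opening it in an SVG-capable browser should show three arcs, each revealing its gene name on hover. Because all coordinates are derived from fractions of the genome length, changing `size` rescales the wheel without loss of quality.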
The classification of COG domains in functional categories enables functional coloring of the genome maps (Tatusov et al., 2001). Numerical data like GC%, GC-skew, AT-skew and data from microarray experiments can be visualized with a color gradient. Microarray data is accepted in multiple formats. Additionally, manual selection and coloring can be made based on ORF numbers, keywords or customized functional categories. Gene annotation data and external databases like the NCBI are cross-linked with the genome maps by mouse-over functions. Examples of pictures generated with the Microbial Genome Viewer are shown in Fig. 1; more examples can be found at http://www.cmbi.kun.nl/MGV/examples.

FUTURE

Subsequent versions of the Microbial Genome Viewer will enable the combinatorial visualization of the genome maps with metabolic pathways and gene regulatory networks. Different levels of visualization will ultimately be connected by transcriptome and metabolome data.

ACKNOWLEDGEMENTS

Supported by the Netherlands Ministry of Economic Affairs, IOP Programme, grant IGE01018, and the Netherlands Organisation of Scientific Research (NWO) BioMolecular Informatics Programme, grants 050.50.206 & 700.51.103 (Paracel GeneMatcher2).

REFERENCES

Bateman, A., Birney, E., Cerruti, L., Durbin, R., Etwiller, L., Eddy, S.R., Griffiths-Jones, S., Howe, K.L., Marshall, M., Sonnhammer, E.L. (2002) The Pfam protein families database. Nucleic Acids Res., 30, 276-280.
Ermolaeva, M.D., Khalak, H.G., White, O., Smith, H.O., Salzberg, S.L. (2000) Prediction of transcription terminators in bacterial genomes. J. Mol. Biol., 301, 27-33.
Gibson, G., Smith, D.R. (2003) Genome visualization made fast and simple. Bioinformatics, 19, 1339-1450.
Kleerebezem, M., Boekhorst, J., van Kranenburg, R., Molenaar, D., Kuipers, O.P., Leer, R., Tarchini, R., Peters, S.A., Sandbrink, H.M., Fiers, M.W., Stiekema, W., Lankhorst, R.M., Bron, P.A., Hoffer, S.M., Groot, M.N., Kerkhoven, R., de Vries, M., Ursing, B., de Vos, W.M., Siezen, R.J. (2003) Complete genome sequence of Lactobacillus plantarum WCFS1. Proc. Natl. Acad. Sci. USA, 100, 1990-1995.
Pedersen, A.G., Jensen, L.J., Brunak, S., Staerfeldt, H.H., Ussery, D.W. (2000) A DNA structural atlas for Escherichia coli. J. Mol. Biol., 299, 907-930.
Sato, N., Ehira, S. (2003) GenoMap, a circular genome data viewer. Bioinformatics, 19, 1583-1584.
Tatusov, R.L., Natale, D.A., Garkavtsev, I.V., Tatusova, T.A., Shankavaram, U.T., Rao, B.S., Kiryutin, B., Galperin, M.Y., Fedorova, N.D., Koonin, E.V. (2001) The COG database: new developments in phylogenetic classification of proteins from complete genomes. Nucleic Acids Res., 1, 22-28.
Yoshida, K., Kobayashi, K., Miwa, Y., Kang, C.M., Matsunaga, M., Yamaguchi, H., Tojo, S., Yamamoto, M., Nishi, R., Ogasawara, N., Nakayama, T., Fujita, Y. (2001) Combined transcriptome and proteome analysis as a powerful approach to study genes under glucose repression in Bacillus subtilis. Nucleic Acids Res., 29, 683-692.

Similar resources

Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration

Data visualization is an essential component of genomic data analysis. However, the size and diversity of the data sets produced by today's sequencing and array-based profiling methods present major challenges to visualization tools. The Integrative Genomics Viewer (IGV) is a high-performance viewer that efficiently handles large heterogeneous data sets, while providing a smooth and intuitive u...

Analysis of common k-mers for whole genome sequences using SSB-tree.

As sequenced genomes become larger and sequencing process becomes faster, there is a need to develop a tool to analyze sequences in the whole genomic scale. However, on-memory algorithms such as suffix tree and suffix array are not applicable to the analysis of whole genome sequence set, since the size of individual whole genome ranges from several million base pairs to hundreds billion base pa...

3-D Visualization for Gene Rearrangement in Ternary Comparison

An increase of full genome sequences from wide range of species enables “comparative genomics”, the study of similarities and differences in structure and functions of genetic information across taxa. One line of research has focused on the conservation of order of genes within gene clusters in various species of bacteria [1]. However, the results of these studies were sometimes inconsistent wi...

CARMEN - Comparative Analysis and in silico Reconstruction of organism-specific MEtabolic Networks.

New sequencing technologies provide ultra-fast access to novel microbial genome data. For their interpretation, an efficient bioinformatics pipeline that facilitates in silico reconstruction of metabolic networks is highly desirable. The software tool CARMEN performs in silico reconstruction of metabolic networks to interpret genome data in a functional context. CARMEN supports the visualizatio...

Publication date: 2003